Patterns and Rewrite Rules for Systematic Code Generation (From High-Level Functional Patterns to High-Performance OpenCL Code)
Abstract
Computing systems have become increasingly complex with the emergence of heterogeneous hardware combining multicore CPUs and GPUs. These parallel systems exhibit tremendous computational power at the cost of increased programming effort. This results in a tension between achieving performance and code portability. Code is either tuned using device-specific optimizations to achieve maximum performance or is written in a high-level language to achieve portability at the expense of performance. We propose a novel approach that offers high-level programming, code portability, and high performance. It is based on algorithmic pattern composition coupled with a powerful, yet simple, set of rewrite rules. This enables systematic transformation and optimization of a high-level program into a low-level, hardware-specific representation, which leads to high-performance code. We test our design in practice by describing a subset of the OpenCL programming model with low-level patterns and by implementing a compiler which generates high-performance OpenCL code. Our experiments show that we can systematically derive high-performance, device-specific implementations from simple high-level algorithmic expressions. The performance of the generated OpenCL code is on par with highly tuned implementations for multicore CPUs and GPUs written by experts.
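To make the abstract's idea more concrete, the following is a minimal sketch, written in Haskell rather than the paper's own pattern language, of what an algorithmic-pattern expression and a semantics-preserving rewrite rule can look like. The names sumOfSquares, splitN, joinN, and mapRewritten are illustrative assumptions, not the paper's API; the split/join rule shown is the general kind of rule that exposes nested parallelism a code generator can map onto OpenCL work-groups and work-items.

-- Minimal Haskell sketch (illustrative only, not the paper's implementation).

import Data.List (foldl')

-- A high-level algorithmic expression: sum of squares written as a
-- composition of the reduce and map patterns.
sumOfSquares :: [Float] -> Float
sumOfSquares = foldl' (+) 0 . map (\x -> x * x)

-- splitN n reshapes a flat list into chunks of at most n elements;
-- joinN flattens it back. Both names are hypothetical helpers.
splitN :: Int -> [a] -> [[a]]
splitN _ [] = []
splitN n xs = take n xs : splitN n (drop n xs)

joinN :: [[a]] -> [a]
joinN = concat

-- One example of a semantics-preserving rewrite rule:
--   map f  ==  joinN . map (map f) . splitN n
-- The right-hand side computes the same values but exposes a two-level
-- structure that a backend could assign to work-groups (outer map) and
-- work-items (inner map) when generating OpenCL code.
mapRewritten :: Int -> (a -> b) -> [a] -> [b]
mapRewritten n f = joinN . map (map f) . splitN n

main :: IO ()
main = do
  let xs = [1 .. 8] :: [Float]
  print (sumOfSquares xs)                          -- 204.0
  print (mapRewritten 4 (* 2) xs == map (* 2) xs)  -- True: the rewrite preserves semantics

Because each rule preserves semantics, a compiler is free to apply them systematically while searching for a fast, device-specific implementation; that is the property the abstract relies on.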
Similar resources
High-Level Programming of Stencil Computations on Multi-GPU Systems Using the SkelCL Library
The implementation of stencil computations on modern, massively parallel systems with GPUs and other accelerators currently relies on manually-tuned coding using low-level approaches like OpenCL and CUDA. This makes development of stencil applications a complex, time-consuming, and error-prone task. We describe how stencil computations can be programmed in our SkelCL approach that combines high...
Geometric Algebra enhanced Precompiler for C++ and OpenCL
The focus of this work is a simplified integration of algorithms expressed in Geometric Algebra (GA) into modern high-level computer languages, namely C++, OpenCL and CUDA. A high runtime performance in terms of GA is achieved using symbolic simplification and code generation by a precompiler that is directly integrated into CMake-based build toolchains.
Automatic Translation of CUDA to OpenCL and Comparison of Performance Optimizations on GPUs
As an open, royalty-free framework for writing programs that execute across heterogeneous platforms, OpenCL gives programmers access to a variety of data-parallel processors including CPUs, GPUs, the Cell and DSPs. All OpenCL-compliant implementations support a core specification, thus ensuring robust functional portability of any OpenCL program. This thesis presents the CUDAtoOpenCL source-to-s...
The Feasibility of Using OpenCL Instead of OpenMP for Parallel CPU Programming
OpenCL, along with CUDA, is one of the main tools used to program GPGPUs. However, it allows running the same code on multi-core CPUs too, making it a rival for the long-established OpenMP. In this paper we compare OpenCL and OpenMP when developing and running compute-heavy code on a CPU. Both ease of programming and performance aspects are considered. Since, unlike a GPU, no memory copy operat...
Geometric Algebra enhanced Precompiler for C++, OpenCL and Mathematica’s OpenCLLink
The focus of this work is a simplified integration of algorithms expressed in Geometric Algebra (GA) into modern high level computer languages, namely C++, OpenCL and CUDA. A high runtime performance in terms of GA is achieved using symbolic simplification and code generation by a precompiler that is directly integrated into CMake-based build toolchains. Finally, we demonstrate how to interface...
Journal: CoRR
Volume: abs/1502.02389
Publication date: 2015